AITopics | proof step

proof step

NATURALPROVER: Grounded Mathematical Proof Generation with Language Models

Neural Information Processing Systems | Apr-25-2026, 00:49:47 GMT

Theorem proving in natural mathematical language, the mixture of symbolic and natural language used by humans, plays a central role in mathematical advances and education, and tests aspects of reasoning that are core to intelligence. Yet it has remained underexplored with modern generative models. We study large-scale language models on two new generation tasks: suggesting the next step in a mathematical proof, and full proof generation. We develop NATURALPROVER, a language model that generates proofs by conditioning on background references (e.g.

large language model, logic & formal reasoning, machine learning, (20 more...)

Country:

Europe (0.46)
North America (0.28)

Genre: Research Report > New Finding (0.46)

Industry:

Law (0.93)
Education > Curriculum > Subject-Specific Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.70)
Information Technology > Artificial Intelligence > Natural Language > Generation (0.66)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.48)
(2 more...)

Proving Theorems Recursively

Haiming Wang

Neural Information Processing Systems | Feb-16-2026, 23:51:56 GMT

Recent advances in automated theorem proving leverage language models to explore expanded search spaces by step-by-step proof generation. However, such approaches are usually based on short-sighted heuristics (e.g., log probability or value function scores) that potentially lead to suboptimal or even distracting sub-goals, preventing us from finding longer proofs. To address this challenge, we propose POETRY (PrOvE Theorems RecursivelY), which proves theorems in a recursive, level-by-level manner in the Isabelle theorem prover. Unlike previous step-by-step methods, POETRY searches for a verifiable sketch of the proof at each level and focuses on solving the current level's theorem or conjecture. Detailed proofs of intermediate conjectures within the sketch are temporarily replaced by a placeholder tactic called sorry, deferring their proofs to subsequent levels. This approach allows the theorem to be tackled incrementally by outlining the overall theorem at the first level and then solving the intermediate conjectures at deeper levels. Experiments are conducted on the miniF2F and PISA datasets, and significant performance gains are observed for POETRY over state-of-the-art methods. POETRY on miniF2F achieves an average proving success rate improvement of 5.1%. Moreover, we observe a substantial increase in the maximum proof length found by POETRY, from 10 to 26.
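The level-by-level idea in this abstract can be illustrated with a small toy. The sketch below is only an illustration under stated assumptions: it does not call Isabelle, it performs no search or verification, and every name in it (`TOY_SKETCHES`, `propose_sketch`, `prove_recursively`, `max_depth`) is hypothetical rather than part of POETRY. A lookup table stands in for the language model that proposes a one-level sketch, and any step justified by the `sorry` placeholder is deferred to the next level.

```python
from dataclasses import dataclass, field

SORRY = "sorry"  # placeholder standing in for a deferred sub-proof


@dataclass
class Sketch:
    goal: str
    # Each step is (statement, justification); justification == SORRY
    # marks an intermediate conjecture whose proof is deferred.
    steps: list = field(default_factory=list)


# Hypothetical toy "proposer": maps a goal to a one-level proof sketch.
TOY_SKETCHES = {
    "main": [("lemma_a", SORRY), ("lemma_b", SORRY), ("main", "by lemma_a lemma_b")],
    "lemma_a": [("lemma_a", "by simp")],
    "lemma_b": [("helper", SORRY), ("lemma_b", "by helper")],
    "helper": [("helper", "by arith")],
}


def propose_sketch(goal):
    """Return a one-level sketch for the goal, or None if none is known."""
    steps = TOY_SKETCHES.get(goal)
    return Sketch(goal, list(steps)) if steps is not None else None


def prove_recursively(goal, level=0, max_level=5):
    """Prove the goal at this level, then recurse into each deferred conjecture.

    Returns a nested dict describing the proof, or None on failure.
    """
    if level > max_level:
        return None
    sketch = propose_sketch(goal)
    if sketch is None:
        return None
    subproofs = {}
    for statement, justification in sketch.steps:
        if justification == SORRY:
            sub = prove_recursively(statement, level + 1, max_level)
            if sub is None:
                return None  # a deferred conjecture could not be closed
            subproofs[statement] = sub
    return {"goal": goal, "level": level, "steps": sketch.steps, "subproofs": subproofs}


def max_depth(proof):
    """Deepest level reached anywhere in the nested proof."""
    return max([proof["level"]] + [max_depth(p) for p in proof["subproofs"].values()])


proof = prove_recursively("main")
```

In this toy, the first level only outlines `main` in terms of two deferred lemmas, and the deeper levels close them one at a time. A real system would additionally check each sketch with the proof assistant before recursing.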

large language model, logic & formal reasoning, machine learning, (22 more...)

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > China > Guangdong Province > Guangzhou (0.04)
Europe > United Kingdom > Scotland > City of Edinburgh > Edinburgh (0.04)
(3 more...)

Genre: Research Report > Experimental Study (1.00)

Industry:

Information Technology (0.67)
Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

62c6d7893b13a13c659cb815852dd00d-Supplemental-Datasets_and_Benchmarks_Track.pdf

Neural Information Processing Systems | Feb-15-2026, 10:41:27 GMT

large language model, machine learning, natural language, (19 more...)

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
Asia > China > Hong Kong (0.04)
Europe > Italy > Lazio > Rome (0.04)
(2 more...)

Industry: Law (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.99)

62c6d7893b13a13c659cb815852dd00d-Paper-Datasets_and_Benchmarks_Track.pdf

Neural Information Processing Systems | Feb-15-2026, 10:41:24 GMT

large language model, machine learning, programming language, (25 more...)

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
Asia > China > Hong Kong (0.04)
(13 more...)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Software > Programming Languages (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(2 more...)

Measuring Systematic Generalization in Neural Proof Generation with Transformers

Neural Information Processing Systems | Feb-11-2026, 05:53:55 GMT

Specifically, we perform soft theorem-proving by leveraging TLMs to generate natural language proofs. We test the generated proofs for logical consistency, along with the accuracy of the final inference.

artificial intelligence, machine learning, natural language, (19 more...)

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > Canada > Quebec (0.05)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
(3 more...)

Genre: Research Report (0.94)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (0.50)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)
Information Technology > Artificial Intelligence > Representation & Reasoning > Rule-Based Reasoning (0.31)

Learning to Prove Theorems by Learning to Generate Theorems (Supplementary Material)

Neural Information Processing Systems | Feb-10-2026, 12:59:04 GMT

The prover runs in iterations. In each iteration, it travels down from the root node.

artificial intelligence, machine learning, theorem, (15 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.71)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.49)

d2a27e83d429f0dcae6b937cf440aeb1-Paper.pdf

Neural Information Processing Systems | Feb-10-2026, 12:58:57 GMT

proof step, prover, theorem, (16 more...)

Country:

North America > United States > North Carolina > Wake County > Morrisville (0.04)
North America > Canada (0.04)
Europe > Sweden > Stockholm > Stockholm (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.69)

Thor: Wielding Hammers to Integrate Language Models and Automated Theorem Provers

Neural Information Processing Systems | Feb-8-2026, 08:04:51 GMT

In theorem proving, the task of selecting useful premises from a large library to unlock the proof of a given conjecture is crucially important. This presents a challenge for all theorem provers, especially the ones based on language models, due to their relative inability to reason over huge volumes of premises in text form.

logic & formal reasoning, machine learning, urlhttp, (20 more...)

Country:

Europe > Germany > Bremen > Bremen (0.04)
Oceania > Australia > New South Wales > Sydney (0.04)
Europe > United Kingdom > Scotland > City of Edinburgh > Edinburgh (0.04)
(3 more...)

Genre: Research Report (0.48)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (0.48)

1fc548a8243ad06616eee731e0572927-Paper-Conference.pdf

Neural Information Processing Systems | Feb-7-2026, 20:25:47 GMT

large language model, logic & formal reasoning, machine learning, (20 more...)

Country:

North America > Dominican Republic (0.04)
North America > Canada (0.04)
Europe > France > Provence-Alpes-Côte d'Azur > Bouches-du-Rhône > Marseille (0.04)
Europe > Denmark > Capital Region > Copenhagen (0.04)

Genre: Research Report > New Finding (0.46)

Industry:

Law (0.93)
Education > Curriculum > Subject-Specific Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.70)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.48)
(2 more...)

HERMES: Towards Efficient and Verifiable Mathematical Reasoning in LLMs

Ospanov, Azim, Feng, Zijin, Sun, Jiacheng, Bai, Haoli, Shen, Xin, Farnia, Farzan

arXiv.org Artificial Intelligence | Nov-25-2025

Informal mathematics has been central to modern large language model (LLM) reasoning, offering flexibility and enabling efficient construction of arguments. However, purely informal reasoning is prone to logical gaps and subtle errors that are difficult to detect and correct. In contrast, formal theorem proving provides rigorous, verifiable mathematical reasoning, where each inference step is checked by a trusted compiler in systems such as Lean, but lacks the exploratory freedom of informal problem solving. This mismatch leaves current LLM-based math agents without a principled way to combine the strengths of both paradigms. In this work, we introduce Hermes, the first tool-assisted agent that explicitly interleaves informal reasoning with formally verified proof steps in Lean. The framework performs intermediate formal checking to prevent reasoning drift and employs a memory module that maintains proof continuity across long, multi-step reasoning chains, enabling both exploration and verification within a single workflow. We evaluate Hermes on four challenging mathematical reasoning benchmarks using LLMs of varying parameter scales, from small models to state-of-the-art systems. Across all settings, Hermes reliably improves the reasoning accuracy of base models while substantially reducing token usage and computational cost compared to reward-based approaches. On difficult datasets such as AIME'25, Hermes achieves up to a 67% accuracy improvement while using 80% fewer total inference FLOPs. The implementation and codebase are publicly available at https://github.com/aziksh-ospanov/HERMES.
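The interleaving described in this abstract (propose an informal step, check it formally, and keep only verified steps in memory) can be sketched as a short loop. This is a minimal illustration under stated assumptions, not the Hermes implementation: there is no Lean call and no language model, `check_arithmetic` is a hypothetical toy stand-in for a formal checker, and `interleaved_reasoning` is a hypothetical name introduced here.

```python
from typing import Callable, Iterable, List


def check_arithmetic(claim: str) -> bool:
    """Toy stand-in for a formal checker: verifies integer claims 'a + b = c'."""
    lhs, rhs = claim.split("=")
    a, b = lhs.split("+")
    return int(a) + int(b) == int(rhs)


def interleaved_reasoning(
    candidates: Iterable[List[str]],
    check: Callable[[str], bool],
) -> List[str]:
    """Build a chain of verified steps.

    `candidates` yields, for each reasoning step, a list of alternative
    claims (as an informal reasoner might propose). The first alternative
    that passes `check` is stored in memory; if none passes, the chain
    stops so that later steps never build on an unchecked claim.
    """
    memory: List[str] = []
    for alternatives in candidates:
        for claim in alternatives:
            if check(claim):
                memory.append(claim.strip())
                break
        else:
            break  # no alternative verified at this step
    return memory


steps = [
    ["2 + 2 = 5", "2 + 2 = 4"],  # first proposal is wrong, second verifies
    ["4 + 3 = 7"],
    ["7 + 1 = 9"],               # fails verification, so the chain stops here
    ["1 + 1 = 2"],
]
verified = interleaved_reasoning(steps, check_arithmetic)
```

Checking each step before it enters memory is what keeps an early mistake from propagating through the rest of the chain in this toy.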

large language model, machine learning, natural language, (21 more...)

2511.1876

Country: